Liftered forward masking procedure for robust digits recognition
Abstract
Using TI digits recognition experiments, we show that a combination of two dynamic speech features, Liftered Forward Masked (LFM) MFCC and the 2-D cepstrum, can improve system robustness to additive Volvo noise while maintaining performance comparable to that of standard MFCC features in clean conditions. Through experiments, we show that the information extracted by forward masking and the information extracted by the 2-D cepstrum are in some sense orthogonal. By combining the LFM MFCC and the 2-D cepstrum, we achieve a recognition rate above 90% on the TI connected digits task, even in additive Volvo noise conditions with an SNR as low as 0 dB. This corresponds to an SNR gain of over 30 dB compared with standard MFCC plus dynamic and acceleration coefficients.
Similar resources
Decorrelated and liftered filter-bank energies for robust speech recognition
Though Mel frequency cepstral coefficients (MFCCs) have been very successful in speech recognition, they have the following two problems: 1) they do not have any physical interpretation, and 2) liftering of cepstral coefficients, found to be highly useful in the earlier dynamic warping-based speech recognition systems, has no effect in the recognition process when used with continuous observation Ga...
Forward masking on a generalized logarithmic scale for robust speech recognition
This paper examines forward masking on a generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. Forward masking in the dynamic cepstral (DyC) representation is based on subtracting a masking pattern from the current spectrum in the logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...
Cepstrum derived from differentiated power spectrum for robust speech recognition
In this paper, cepstral features derived from the differential power spectrum (DPS) are proposed for improving the robustness of a speech recognizer in the presence of background noise. These robust features are computed from the speech signal of a given frame in four steps. First, the short-time power spectrum of the speech signal is computed through the fast ...
An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition
This article presents a new auditory feature extraction method that takes into account the forward-masking mechanism of auditory nerves and is feasible in practice. Two features based on this method are extracted: FMFRC (forward masking firing-rate cepstrum) and FMSRC (forward masking synchronized rate cepst...
Improved Forward Masking on a Generalized Logarithmic Scale for Robust Speech Recognition
We previously proposed forward masking on a generalized logarithmic scale to eliminate convolutional noise as well as to suppress additive noise. While the generalized Dynamic Cepstrum derived from the masked spectrum has been robust to both types of noise, its robustness to convolutional noise degrades slightly compared with masking on the logarithmic scale, and the optimal masking coefficient depe...